Selective processing of location-sensitive data streams

ABSTRACT

A method for processing a first data stream specifying locations of a user at different times and a second data stream specifying values of a monitored attribute at a location of interest at different times includes: receiving a location-centric trigger specifying at least one spatial predicate condition relative to the location of interest and at least one non-spatial predicate condition relevant to the location of interest, calculating a safe region that includes locations whose probability of satisfying the spatial predicate condition falls below a first threshold, calculating a safe value container that includes values whose probability of satisfying the non-spatial predicate condition falls below a second threshold, and processing the first data stream and the second data stream against the location-centric trigger, by considering only those locations that are not contained within the safe region and only those values that are not contained within the safe value container.

CROSS REFERENCE TO RELATED APPLICATION

This application is a continuation of co-pending U.S. patent application Ser. No. 13/469,291, filed May 11, 2012, which in turn is a division of co-pending U.S. patent application Ser. No. 12/575,371, filed Oct. 7, 2009, both of which are herein incorporated by reference in their entireties.

BACKGROUND OF THE INVENTION

The present invention relates generally to data stream processing applications, and relates more specifically to the processing of location-based data streams to allow monitoring of location-sensitive data.

The availability of inexpensive location-sensing technologies and the advancement of wireless communication technology have led to an explosion in location-based services. At the same time, other technological advancements have led to an abundance of location-sensitive data. Within this context, information needs may be expressed using location-centric triggers.

For example, a user of a mobile device may install a location-centric trigger in a location-based monitoring server for a particular gas station. This trigger may specify spatial and non-spatial predicate conditions for activating the trigger. For example, the user may request that the trigger be activated when the user is within one mile of the gas station and the gas price is below four dollars. In this case, “within one mile of the gas station” is a spatial predicate condition, while “the gas price is below four dollars” is a non-spatial predicate condition. Data relating to the gas price is “location-sensitive” because it is tied to a particular location (i.e., the gas station).

The location-based information monitoring server receives information in the form of data streams that arrive continuously, rapidly, and in real time from multiple sources. In the above example, these data streams may include, for example, a first data stream identifying the location of the user at various times and a second data stream identifying the price of gas at various times. The data streams are processed against the user's location-specific trigger in order to determine when the trigger should be activated.

Simplistic systems process the data streams as they are received. However, if the location-based information monitoring system receives a large number of data streams and/or processes location-centric triggers for a large number of users, processing data streams as they arrive may delay the delivery of information to the users because processing resources are wasted on large amounts of irrelevant data (i.e., data that does not activate any of the triggers).

SUMMARY OF THE INVENTION

A method for processing a first data stream specifying locations of a user at different times and at least a second data stream specifying values of a monitored attribute at a location of interest at different times includes: receiving a location-centric trigger specifying at least one spatial predicate condition relative to the location of interest and at least one non-spatial predicate condition relevant to the location of interest, calculating a safe region that includes locations whose probability of satisfying the spatial predicate condition falls below a first threshold, calculating one or more safe value containers that include values whose probabilities of satisfying the non-spatial predicate conditions fall below one or more second thresholds, and processing the first data stream and the at least a second data stream against the location-centric trigger, by considering only those locations that are not contained within the safe region and only those values that are not contained within respective safe value containers for the corresponding non-spatial predicate conditions.

BRIEF DESCRIPTION OF THE DRAWINGS

So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments.

FIG. 1 is a schematic diagram illustrating one embodiment of a location-based information monitoring system, according to the present invention;

FIG. 2 is a flow diagram illustrating one embodiment of a method for processing location-sensitive data streams, according to the present invention;

FIG. 3A is a graph illustrating an exemplary user location in a two-dimensional coordinate space;

FIG. 3B is a graph illustrating exemplary one-dimensional value domains associated with each of a plurality of monitored attributes;

FIG. 4 is a flow diagram illustrating one embodiment of a method for assisting in the processing of location-sensitive data streams, according to the present invention;

FIG. 5A is a schematic diagram illustrating a mobile user at a location;

FIG. 5B is a graph illustrating the probability density function for a mobile user's motion inside a safe region;

FIG. 6 is a flow diagram illustrating one embodiment of a method for computing a safe region, according to the present invention;

FIG. 7A is a schematic diagram illustrating an exemplary grid cell that is divided into four partitions or quadrants;

FIG. 7B is a schematic diagram illustrating the exemplary grid cell of FIG. 7A, including a set of tension points obtained from the candidate points illustrated in FIG. 7A;

FIG. 7C is a schematic diagram illustrating the exemplary grid cell of FIGS. 7A-B, including a set of component rectangles obtained from the tension points illustrated in FIG. 7B;

FIG. 7D is a schematic diagram illustrating the exemplary grid cell of FIGS. 7A-C, in which a final safe region composed of the component rectangles illustrated in FIG. 7C is selected;

FIG. 8 is a flow diagram illustrating one embodiment of a method for computing a safe value container, according to the present invention;

FIG. 9 is a high-level block diagram of the location-based monitoring method that is implemented using a general purpose computing device;

FIG. 10A illustrates the use of a single safe value container; and

FIG. 10B illustrates the use of multiple safe value containers.

DETAILED DESCRIPTION

In one embodiment, the invention is a method and apparatus for selective processing of location-sensitive data streams. Embodiments of the invention implement a selective approach to the processing of incoming data streams, as opposed to processing the incoming data streams on delivery. In particular, stream data with a low probability of activating a location-centric trigger is discarded without being processed, allowing more server resources to be devoted to the processing of stream data with a greater probability of activating a trigger.

Embodiments of the invention rely on the use of “safe regions” for location-based stream data and “safe value containers” for monitored stream data, where data is “safe” if it can be discarded (because it is not likely to activate a trigger). Specifically, a “safe region” is a physical location in which a user's location-centric triggers are not likely to be activated. For instance, referring to the gas station example discussed in the background, the trigger's safe region encompasses any location outside of an approximately one-mile radius from the gas station. A “safe value container” is a range of values for a monitored attribute that is not likely to activate the user's location-centric triggers. For instance, referring again to the gas station example, the trigger's “safe value container” includes any gas prices above four dollars. The probability threshold for the location data and the monitored data may be the same, or each may have a different threshold. For ease of explanation, discussion of the invention herein refers to a single safe value container. However, those skilled in the art will appreciate that a location-centric trigger may be associated with a plurality of safe value containers (e.g., one safe value container for each non-spatial predicate condition). Thus, any instance discussed herein in which reference is made to a single safe value container inherently contemplates the existence of multiple safe value containers.

As discussed above, each mobile user u_(i)∈U expresses her location-based information needs in the form of location-centric triggers t_(ij)∈T at a location of interest l_(j)∈L. In one embodiment, the triggers are installed at an information monitoring server that receives location updates from mobile users as well as data updates from other data sources and processes these updates to determine if any relevant triggers need to be activated. The term “mobile user” and the designation u_(i) are used interchangeably herein to refer to both the user of a mobile device and the mobile device itself (e.g., a mobile global positioning system (GPS) device, a cellular telephone, a personal digital assistant, a laptop computer, a satellite radio receiver, or the like).

FIG. 1 is a schematic diagram illustrating one embodiment of a location-based information monitoring system 100, according to the present invention. As illustrated, the system 100 comprises an information monitoring server 102 and a plurality of data sources that provide data updates to the information monitoring server 102. These data sources include location data sources 104 and monitored data sources 106.

The location data sources 104 comprise one or more base stations that are in communication with mobile users and that provide location data relevant to the mobile users (i.e., the locations of the mobile users at given times). Each of the location data sources 104 delivers updates in the form of location data streams from multiple mobile users to the information monitoring server 102. A location data stream contains tuples of the form l_(u_i)(t)=⟨u_(i),t,x,y⟩, which indicates the location of the mobile user u_(i) in two-dimensional coordinate space at time t. In one embodiment, the location data sources 104 operate within a wireless environment in which the base stations communicate directly with the mobile users.

The monitored data sources 106 comprise one or more information delivery systems that provide monitored data relevant to one or more locations of interest (e.g., gas prices at a specified gas station). Each of the monitored data sources 106 delivers updates in the form of monitored data streams relevant to the monitored data at different locations of interest to the information monitoring server 102. A monitored data stream contains tuples of the form m_(l_j)(t)=⟨l_(j),t,a₁,a₂, . . . , a_(r)⟩, which contains r-dimensional monitored data. Here a_(k), k∈[1 . . . r], is a value in domain D_(k), which represents the value of the k^(th) monitored attribute at location of interest l_(j) at time t. In one embodiment, the monitored data sources 106 operate within an Ethernet environment.
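The two tuple formats described above can be pictured as simple record types. The following Python sketch is purely illustrative; the type and field names (LocationUpdate, MonitoredUpdate, user_id, location_id, values) are assumptions introduced here and do not appear in the specification.

```python
from typing import NamedTuple, Tuple


class LocationUpdate(NamedTuple):
    """One tuple <u_i, t, x, y> of a location data stream (field names assumed)."""
    user_id: str      # mobile user u_i
    t: float          # time of the reading
    x: float          # position in the two-dimensional coordinate space
    y: float


class MonitoredUpdate(NamedTuple):
    """One tuple <l_j, t, a_1, ..., a_r> of a monitored data stream (field names assumed)."""
    location_id: str             # location of interest l_j
    t: float                     # time of the reading
    values: Tuple[float, ...]    # the r monitored attribute values a_1 .. a_r


# Hypothetical readings for the gas station example: User A near the origin,
# and a single monitored attribute (the gas price) at gas station G.
loc = LocationUpdate("A", 10.0, 0.5, -0.2)
mon = MonitoredUpdate("G", 9.0, (3.89,))
```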

The information monitoring server 102 maintains a set T of location-centric triggers. Each trigger t_(i,j)∈T represents a trigger installed by mobile device user u_(i)∈U on location of interest l_(j)∈L. The information monitoring server 102 receives updates from the location data sources 104 and the monitored data sources 106 in the form of streaming data, processes the updates, and activates location-centric triggers in response to the updates. As illustrated, the information monitoring server 102 comprises five main components, each of which may individually comprise a processor. These components include: a data processor 108, a trigger manager 110, an optimizer 112, an event detector 114, and a data manager 116.

The data processor 108 receives updates directly from the location data sources 104 and the monitored data sources 106. In addition, the data processor 108 receives requests to install new location-centric triggers from the location data sources 104. The data processor 108 classifies the updates according to their sources (e.g., location data source or monitored data source) and then provides the classified updates to the optimizer 112. The requests to install new location-centric triggers are forwarded by the data processor 108 to the trigger manager 110.

The optimizer 112 receives the classified updates from the data processor 108 and determines whether the classified updates should be processed. As discussed in further detail below, the determination as to whether a classified update should be processed is based on the probability that the classified update will activate (or not activate) a location-centric trigger for at least one of the users. In one embodiment, the optimizer 112 facilitates this determination by computing “safe regions” for the location data and “safe value containers” for the monitored data. The optimizer 112 delivers the safe regions and safe value containers to the data manager 116.

The trigger manager 110 receives the requests to install new location-centric triggers from the data processor 108 and handles the addition and removal of triggers in accordance with these requests. In addition, the trigger manager 110 coordinates with the optimizer 112 and the event detector 114 in order to determine whether any triggers should be activated in response to incoming updates.

The data manager 116 receives the safe regions and safe value containers from the optimizer 112 and communicates the safe regions back to the relevant mobile users for use in self-monitoring, discussed in further detail below. The data manager 116 also stores the safe value containers. In one embodiment, safe value containers are not communicated to the monitored data sources 106 because it is assumed that the monitored data sources 106 do not possess computational power that can be devoted to self-monitoring; however, in other embodiments, the data manager 116 communicates the safe value containers to the monitored data sources 106. In addition, the data manager 116 receives updates for processing from the optimizer 112.

The data manager 116 delivers the safe region information, safe value container information, and updates to the event detector 114. In addition, the event detector 114 receives trigger information from the trigger manager 110. The event detector 114 determines whether to activate a trigger by processing the data received from the data manager 116 against the data provided by the trigger manager 110.

FIG. 2 is a flow diagram illustrating one embodiment of a method 200 for processing location-sensitive data streams, according to the present invention. The method 200 may be implemented, for example, by the information monitoring server 102 illustrated in FIG. 1. As such, reference is made in the discussion of the method 200 to various components of the information monitoring server 102. However, it is understood that the method 200 is not limited to operation in conjunction with the information monitoring server 102, and may readily be deployed in systems having different configurations from that illustrated.

The method 200 is initialized at step 202 and proceeds to step 204, where the data processor 108 receives one or more location-centric triggers from one or more mobile users. In one embodiment, these location-centric triggers are received from one or more of the location data sources 104, such as base stations that are in communication with mobile users. Each location-centric trigger specifies a set of spatial and non-spatial predicate conditions for activating the trigger (e.g., “Notify User A when User A is within one mile of gas station G and the gas price is below four dollars”). In one embodiment, these triggers are expressed in the form of <monitored attribute><op><value>, where <op>∈{<,>,≤,≥}, combined using the logical ∧ operator. For instance, User A's trigger can be expressed as t_(A,G)=(x≥−1 ∧ x≤1 ∧ y≥−1 ∧ y≤1 ∧ p<4), where the first four constraints express the spatial trigger region using the minimum bounding rectangle of a circle of one-mile radius around the gas station G, assuming that the gas station is located at the origin of the coordinate space. The last constraint expresses the gas price requirement. The predicate conditions specified on the spatial region will be a common feature of all triggers; however, different locations of interest will have different monitored attributes associated with them. In one embodiment, the method 200 assumes that a trigger specifies predicate conditions on all of the monitored attributes associated with the corresponding location of interest.
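A trigger of the form described above, a conjunction of <monitored attribute><op><value> predicates, can be sketched in a few lines of Python. This is a minimal illustration only: the names Predicate, trigger_fires, and T_A_G, and the use of a dictionary to hold the current readings, are assumptions made for the example rather than part of the described system.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List

# The four comparison operators permitted in a predicate condition.
OPS: Dict[str, Callable[[float, float], bool]] = {
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
}


@dataclass(frozen=True)
class Predicate:
    """One condition of the form <monitored attribute><op><value>."""
    attribute: str
    op: str
    value: float

    def holds(self, reading: Dict[str, float]) -> bool:
        return OPS[self.op](reading[self.attribute], self.value)


def trigger_fires(predicates: List[Predicate], reading: Dict[str, float]) -> bool:
    """A trigger fires only when every predicate holds (logical AND)."""
    return all(p.holds(reading) for p in predicates)


# User A's trigger on gas station G, with G assumed to sit at the origin.
T_A_G = [
    Predicate("x", ">=", -1), Predicate("x", "<=", 1),   # spatial: bounding box
    Predicate("y", ">=", -1), Predicate("y", "<=", 1),
    Predicate("p", "<", 4),                               # non-spatial: gas price
]
```

With these definitions, a reading such as {"x": 0.5, "y": -0.5, "p": 3.89} satisfies all five predicates, whereas the same position with a price of 4.05 does not.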

In one embodiment, the triggers are classified into one of three categories depending on their relevance to the population of mobile users: private, public, or shared. Considering a location-based information monitoring system with n mobile users, private triggers t_(i,j)^(private)∈T are relevant to a single mobile user, where i∈[1 . . . n] and |i|=1. Shared triggers t_(i,j)^(shared)∈T are relevant to at least two mobile users under the constraints i∈[1 . . . n] and 2≤|i|≤n′, where n′ specifies system limitations on the maximum number of mobile users permitted to share a trigger. Public triggers t_(i,j)^(public)∈T are relevant to all of the mobile users, |i|=n. In a further embodiment, an additional constraint specifies that a mobile user may have only one trigger relevant to a given location of interest l_(j).
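The three categories can be illustrated by classifying a trigger according to the number of mobile users it is relevant to. The sketch below is an assumption-laden illustration: the function name classify_trigger is hypothetical, and the order of the checks (private first, then public, then shared) is a choice made here to resolve cases the text leaves open, such as a shared count that happens to equal n.

```python
def classify_trigger(num_users: int, n: int, n_prime: int) -> str:
    """Classify a trigger by how many of the n mobile users it is relevant to.

    n_prime is the system limit on how many users may share a trigger.
    """
    if not 1 <= num_users <= n:
        raise ValueError("a trigger must be relevant to between 1 and n users")
    if num_users == 1:
        return "private"
    if num_users == n:
        return "public"
    if num_users <= n_prime:
        return "shared"
    raise ValueError("user count exceeds the sharing limit n'")
```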

In step 206, the data processor 108 receives one or more location updates from the location data sources 104. As discussed above, the location updates indicate the physical locations of the mobile users at given times (e.g., where user A is at time t).

In step 208, the data processor 108 receives one or more data updates from the monitored data sources 106. As discussed above, the data updates indicate monitored data relevant to one or more locations of interest (e.g., the gas price at gas station G at time t).

In step 210, the optimizer 112 computes one or more safe regions in accordance with the location updates and one or more safe value containers in accordance with the data updates. As discussed above, a safe region is a physical location in which a mobile user's location-centric triggers are not likely to be activated, while a safe value container is a range of values for a given parameter that is not likely to activate any of the mobile users' location-centric triggers.

The safe region for each user u_(i) in a set of users U may be defined as ψ(u_(i)). One specific embodiment of a method for calculating a safe region is discussed in greater detail with respect to FIG. 6, while one specific embodiment of a method for calculating a safe value container is discussed in greater detail with respect to FIG. 8.

In step 212, the data manager 116 delivers the safe regions to the mobile users (e.g., via the base stations within the location data sources 104). The data manager 116 also stores the safe value containers (e.g., locally).

In step 214, the event detector 114 processes the location updates and the monitored data updates against the safe regions and the safe value containers in order to produce a reduced set of updates. In particular, any location updates that indicate mobile user locations within the safe regions are discarded. This is because locations within the safe regions have zero probability of activating any location-centric triggers. Once the safe regions have been delivered to mobile users (e.g., as in step 212), the number of location updates that have to be discarded should be greatly reduced. This is because the mobile users can use the safe region information to control when they send location updates, as discussed in further detail in connection with FIG. 4. In addition, data updates that indicate data values falling within the safe value containers are discarded. This is because data values falling within the safe value containers have a zero probability of activating any triggers.
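The filtering performed in step 214 can be sketched as follows. This is a simplified illustration under stated assumptions, not the described implementation: safe regions are taken to be axis-aligned rectangles, each location of interest is assumed to have a single monitored attribute whose safe value containers are half-open intervals, and all names (reduce_location_updates, reduce_data_updates, in_rect, in_any_interval) are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Rect = Tuple[float, float, float, float]            # (x_min, y_min, x_max, y_max)
Interval = Tuple[float, float]                      # half-open range [low, high)
LocationUpdate = Tuple[str, float, float, float]    # (user, t, x, y)
DataUpdate = Tuple[str, float, float]               # (location of interest, t, value)


def in_rect(rect: Rect, x: float, y: float) -> bool:
    x_min, y_min, x_max, y_max = rect
    return x_min <= x <= x_max and y_min <= y <= y_max


def in_any_interval(intervals: List[Interval], value: float) -> bool:
    return any(low <= value < high for low, high in intervals)


def reduce_location_updates(updates: List[LocationUpdate],
                            safe_regions: Dict[str, Rect]) -> List[LocationUpdate]:
    """Discard location updates that fall inside the sender's safe region."""
    return [u for u in updates
            if u[0] not in safe_regions
            or not in_rect(safe_regions[u[0]], u[2], u[3])]


def reduce_data_updates(updates: List[DataUpdate],
                        containers: Dict[str, List[Interval]]) -> List[DataUpdate]:
    """Discard data updates whose value lies in a safe value container."""
    return [u for u in updates
            if not in_any_interval(containers.get(u[0], []), u[2])]


# Hypothetical state: User A's safe region lies well away from station G,
# and any price of four dollars or more is safe for the trigger p < 4.
safe_regions = {"A": (2.0, 2.0, 8.0, 8.0)}
containers = {"G": [(4.0, math.inf)]}
```

Only the updates that survive both filters need to be evaluated against the triggers in step 216.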

In step 216, the event detector 114 processes the reduced set of updates against the location-centric triggers in order to determine whether any of the triggers should be activated. A trigger is to be activated when all of its spatial and non-spatial predicate conditions are satisfied. Thus, following the above example, User A's trigger t_(A,G) is activated when the location updates indicate that User A's current location is within one mile of gas station G and the data updates indicate that the price of gas at gas station G is below four dollars.

In step 218, the event detector 114 determines whether any triggers should be activated, based on the processing performed in step 216. If the event detector 114 concludes in step 218 that no triggers should be activated, the method 200 returns to step 206, and the data processor 108 continues to receive location updates and data updates.

Alternatively, if the event detector 114 concludes in step 218 that at least one trigger should be activated, the method 200 proceeds to step 220, and the event detector 114 activates the trigger(s) by delivering an update to the relevant mobile user(s) (e.g., by informing User A that he is within one mile of gas station G and that the gas price at gas station G is under four dollars). The method 200 then returns to step 206, and the data processor 108 continues to receive location updates and data updates.

The method 200 therefore employs a selective processing approach that drops data updates with less than a threshold probability (e.g., zero probability) of activating any relevant triggers. The probability of a location data update l_(u_i)(t) from a location data source being able to activate a trigger t_(i,j), denoted by Pr[l_(u_i)(t)⇒t_(i,j)], is dependent on two factors: (1) the probability of the mobile user entering the spatial region R_(i,j)^(S) associated with the trigger t_(i,j), denoted by Pr[l_(u_i)(t)∈R_(i,j)^(S)]; and (2) the probability of monitored attribute values m_(l_j)(t′) at the location of interest l_(j) satisfying the non-spatial predicate conditions specified by the installed trigger, denoted by Pr[m_(l_j)(t′)∈R_(i,j)^(NS)]. Here, t′ is the time instant at which the latest monitored data update for the location of interest l_(j) has been received by the information monitoring server, such that t′<t. Thus, one has:

Pr[l_(u_i)(t)⇒t_(i,j)]=f(Pr[l_(u_i)(t)∈R_(i,j)^(S)], Pr[m_(l_j)(t′)∈R_(i,j)^(NS)])   (EQN. 1)

Similarly, the probability of a monitored data update m_(l_j)(t) from a monitored data source being able to activate a trigger t_(i,j), denoted by Pr[m_(l_j)(t)⇒t_(i,j)], is dependent on two factors: (1) the probability of the monitored attribute values m_(l_j)(t) at the location of interest l_(j) satisfying the non-spatial predicate conditions R_(i,j)^(NS) on the trigger t_(i,j), denoted by Pr[m_(l_j)(t)∈R_(i,j)^(NS)]; and (2) the probability of the mobile user location l_(u_i)(t′) lying within the region R_(i,j)^(S) associated with the trigger t_(i,j), denoted by Pr[l_(u_i)(t′)∈R_(i,j)^(S)]. Thus, one has:

Pr[m_(l_j)(t)⇒t_(i,j)]=f(Pr[m_(l_j)(t)∈R_(i,j)^(NS)], Pr[l_(u_i)(t′)∈R_(i,j)^(S)])   (EQN. 2)

In some embodiments, the location-based information monitoring system 100 has installed therein a large number of triggers associated with each mobile user u_(i) and each location of interest l_(j). In this embodiment, the set of triggers T_(i)⊂T is relevant to the mobile user u_(i). Any location update from the mobile user u_(i) should be processed by the information monitoring server only if the probability of activating at least one trigger in the set T_(i), denoted by Pr[l_(u_i)(t)⇒_(≥1)T_(i)], is greater than a predefined threshold (e.g., zero). Thus, one has:

Pr[l_(u_i)(t)⇒_(≥1)T_(i)]=1−Pr[l_(u_i)(t)⇏T_(i)]   (EQN. 3)

where Pr[l_(u_i)(t)⇏T_(i)] denotes the probability of not activating any relevant triggers in the set T_(i). Assuming that each trigger in the set T_(i) is independent of all other triggers, one has:

$\begin{aligned}
\Pr\left[l_{u_i}(t)\not\Rightarrow T_i\right] &= \prod_{j=1}^{|T_i|}\left(1-\Pr\left[l_{u_i}(t)\Rightarrow t_{i,j}\right]\right) && \text{(EQN. 4)}\\
&= \prod_{j=1}^{|T_i|}\left(1-\Pr\left[l_{u_i}(t')\in R_{i,j}^{S}\right]\cdot\Pr\left[m_{l_j}(t')\in R_{i,j}^{NS}\right]\right) && \text{(EQN. 5)}\\
&= \prod_{j=1}^{|T_i|}\left(1-\Pr\left[l_{u_i}(t')\in R_{i,j}^{S}\right]\cdot\prod_{k=1}^{\mathrm{Dim}(R_{i,j}^{NS})}\Pr\left[a_k(t')\in R_{i,j}^{k}\right]\right) && \text{(EQN. 6)}
\end{aligned}$

The monitored attributes associated with a trigger t_(i,j) are considered to be independent of each other. This allows one to represent the probability of the non-spatial predicate conditions being satisfied as a product of the probabilities along each dimension in EQN. 6.
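The structure of EQNs. 3 through 6 can be made concrete with a short numerical sketch. It takes the combining function of EQN. 1 to be a plain product, as EQN. 5 does, and relies on the same independence assumptions stated above. The function names and the input probabilities are hypothetical; how the individual probabilities are estimated is outside the scope of the sketch.

```python
from math import prod
from typing import Sequence, Tuple

# One trigger is summarized as (spatial probability, per-attribute probabilities).
TriggerProbs = Tuple[float, Sequence[float]]


def prob_trigger(p_spatial: float, p_attrs: Sequence[float]) -> float:
    """Probability that one trigger is activated: the spatial probability times
    the product of the per-attribute probabilities (inner term of EQN. 6)."""
    return p_spatial * prod(p_attrs)


def prob_at_least_one(triggers: Sequence[TriggerProbs]) -> float:
    """Probability that an update activates at least one trigger (EQN. 3),
    with the no-activation probability computed as in EQN. 4."""
    p_none = prod(1.0 - prob_trigger(ps, pa) for ps, pa in triggers)
    return 1.0 - p_none
```

If every trigger has a spatial probability of zero, or at least one attribute probability of zero, every factor in the product is one and the result is zero, which is the case the safe regions and safe value containers are designed to detect.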

Similarly, in one embodiment, the set of triggers T_(j)⊂T is relevant to the location of interest l_(j); any monitored update for this location of interest l_(j) should be processed by the information monitoring server 102 only if the probability of activating at least one trigger in the set T_(j) is greater than a predefined threshold (e.g., zero). The probability that none of the triggers in the set T_(j) will be activated is given by:

$\begin{aligned}
\Pr\left[m_{l_j}(t)\not\Rightarrow T_j\right] &= \prod_{i=1}^{|T_j|}\left(1-\Pr\left[m_{l_j}(t)\Rightarrow t_{i,j}\right]\right) && \text{(EQN. 7)}\\
&= \prod_{i=1}^{|T_j|}\left(1-\Pr\left[m_{l_j}(t)\in R_{i,j}^{NS}\right]\cdot\Pr\left[l_{u_i}(t')\in R_{i,j}^{S}\right]\right) && \text{(EQN. 8)}\\
&= \prod_{i=1}^{|T_j|}\left(1-\left(\prod_{k=1}^{\mathrm{Dim}(R_{i,j}^{NS})}\Pr\left[a_k(t)\in R_{i,j}^{k}\right]\right)\cdot\Pr\left[l_{u_i}(t')\in R_{i,j}^{S}\right]\right) && \text{(EQN. 9)}
\end{aligned}$

It can be seen from EQNs. 6 and 9 that the probability of a location data update or a monitored data update activating any relevant trigger is zero if and only if: (1) Pr[l_(u_i)(t′)∈R_(i,j)^(S)]=0, ∀j|t_(i,j)∈T_(i); or (2) Pr[a_(k)(t)∈R_(i,j)^(k)]=0 for any of the k monitored attributes ∀i|t_(i,j)∈T. The safe regions for the spatial locations and the safe value containers for the non-spatial attributes allow one to quickly determine the time instants when these two conditions on probability are satisfied.

A safe region ψ(u_(i)) for each mobile user u_(i) can thus be defined such that as long as the mobile user's location lies within the safe region, the condition Pr[l_(u_i)(t′)∈R_(i,j)^(S)]=0 holds ∀j|t_(i,j)∈T_(i). Consequently, any location data updates from the mobile user u_(i) may be discarded without processing, since the probability of any relevant triggers being activated by these updates is zero. Thus:

Pr[l_(u_i)(t)⇒_(≥1)T_(i) | l_(u_i)(t)∈ψ(u_(i))]=0   (EQN. 10)

A safe value container can be defined for each monitored attribute a_(j)^(k), k∈[1 . . . r], relevant to each location of interest l_(j), such that as long as the value of a monitored attribute falls within one of its safe value containers, denoted by δ_(k)(l_(j)), the condition Pr[m_(l_j)∈R_(i,j)^(k)]=0 holds ∀i|t_(i,j)∈T. As long as the value of even one of the monitored attributes associated with the location of interest l_(j) lies within a safe value container, corresponding monitored data updates for the location of interest l_(j) may be discarded without processing, since the probability of any relevant triggers being activated by these updates is zero. Thus:

Pr[m_(l_j)(t)⇒_(≥1)T_(j) | ∃k∈[1 . . . r], a_(j)^(k)∈δ_(k)(l_(j))]=0   (EQN. 11)

FIG. 3A is a graph illustrating an exemplary user location P in a two-dimensional coordinate space 300. The spatial regions associated with exemplary triggers 1, 2, 3, and 4 are shown, along with the safe region 302 containing the user at position P.

FIG. 3B is a graph illustrating exemplary one-dimensional value domains associated with each of a plurality of r monitored attributes. Corresponding value ranges for triggers along each of the attributes are displayed. The computed safe value containers are displayed under three different scenarios: (1) when the value of the monitored attribute lies outside any relevant trigger's value range (e.g., as in the case of attribute a₁); (2) when the value of the monitored attribute lies inside a single trigger's value range (e.g., as in the case of attribute a₂); and (3) when the value of the monitored attribute lies within multiple intersecting triggers' value ranges (e.g., as in the case of attribute a_(r)).

FIG. 4 is a flow diagram illustrating one embodiment of a method 400 for assisting in the processing of location-sensitive data streams, according to the present invention. The method 400 may be implemented, for example, at a mobile user such as a mobile GPS device.

The method 400 is initialized at step 402 and proceeds to step 404, where the mobile user delivers a location-centric trigger to an information monitoring server (e.g., information monitoring server 102 of FIG. 1), via a base station.

In step 406, the mobile user receives a safe region from the information monitoring server, via the base station. The safe region, as discussed above, indicates a physical location within which the location-centric trigger is not likely to be activated. The mobile user stores the safe region in step 408.

In step 410, the mobile user processes its current location against the stored safe region. In step 412, the mobile user determines whether its current location falls within the safe region.

If the mobile user concludes in step 412 that its current location falls within the safe region, the method 400 returns to step 410, and the mobile user continues to process its location against the stored safe region. Alternatively, if the mobile user concludes in step 412 that its current location does not fall within the safe region, the method 400 proceeds to step 414, and the mobile user delivers a location update to the information monitoring server, via the base station. The method 400 then returns to step 410, and the mobile user continues to process its location against the stored safe region.

Thus, the mobile users use the safe region information to self-monitor their locations. In particular, once a mobile user knows where its safe region is, it can avoid sending location updates to the information monitoring server 102 when it knows that it is inside its safe region. Shifting the location monitoring burden from the information monitoring server 102 to the mobile users allows the mobile users to conserve energy and bandwidth by reducing the number of updates that must be sent. The information monitoring server 102 also conserves energy and bandwidth because the number of updates that it has to process is reduced.

In one embodiment, safe regions are represented as rectangular regions. There are at least three advantages to the rectangular representation: (1) the safe region can be represented compactly through the specification of two points (e.g., bottom-left and top-right corners), making it easy to communicate to the relevant mobile user; (2) mobile users can quickly determine their locations within a rectangular region, which facilitates the self-monitoring of location; and (3) computation of a rectangular safe region requires relatively low processing overhead.
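The self-monitoring check of steps 410-414, applied to a rectangular safe region specified by two corner points, might be sketched as follows. This is an illustrative sketch only: the names `SafeRegion` and `locations_to_report` are hypothetical, the boundary is assumed to count as inside the region, and the transport to the information monitoring server is omitted.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class SafeRegion:
    """Axis-aligned rectangle given by its bottom-left and top-right corners."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, p: Point) -> bool:
        # Assumption: a location on the boundary is treated as inside.
        x, y = p
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def locations_to_report(region: SafeRegion, fixes: Iterable[Point]) -> List[Point]:
    """Keep only the location fixes that fall outside the stored safe region.

    These are the fixes for which a location update would be sent (step 414);
    all other fixes are handled locally (return to step 410).
    """
    return [p for p in fixes if not region.contains(p)]


# Hypothetical example: four successive location fixes, two of them outside.
region = SafeRegion(0.0, 0.0, 2.0, 1.0)
fixes = [(0.5, 0.5), (1.9, 0.9), (2.1, 0.9), (1.0, -0.2)]
updates = locations_to_report(region, fixes)
```

The containment test costs four comparisons per fix, which is consistent with advantage (2) of the rectangular representation.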

FIG. 5A is a schematic diagram illustrating a mobile user at location $\vec{P}$, where $\vec{P}_{last}$ indicates the last-recorded location of the mobile user. The probability density function for the mobile user's motion inside the safe region, denoted by p(φ), is given by:

$\begin{matrix}{{p(\phi)} = \left\{ \begin{matrix}\frac{1 + {\frac{s}{t}\left\lceil \frac{{\pi/2} - {\phi }}{{s/t} \cdot \pi} \right\rceil}}{2\pi} & {{{if} - {\pi/2}} \leq \phi \leq {\pi/2}} \\\frac{1 - {\frac{s}{t}\left\lceil \frac{{\phi } - {\pi/2}}{{s/t} \cdot \pi} \right\rceil}}{2\pi} & {otherwise}\end{matrix} \right.} & \left( {{EQN}.\mspace{14mu} 12} \right)\end{matrix}$

In EQN. 12, s and t are parameters of steadiness such that s/t<1. FIG. 5B is a graph illustrating the probability density function for the mobile user's motion inside the safe region for s=1 and for different values of t. The value of s in particular determines the weight to be assigned to the probability of the mobile user moving in the same direction, whereas t determines the granularity of change in φ for which the probability value changes. As illustrated in FIG. 5B, the probability of the mobile user moving in a direction such that 0≤φ≤π/t is the same; for values of φ>π/t, this probability decreases. Assuming random motion would lead to the probability of motion in any direction being 1/(2π).
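The stepwise behavior of EQN. 12 can be explored with a short sketch. It assumes that the angle enters the ceiling terms through its absolute value |φ|, so that the density is symmetric about the last direction of motion; the function name `direction_pdf` is hypothetical.

```python
import math


def direction_pdf(phi: float, s: float, t: float) -> float:
    """Evaluate the steady-motion density of EQN. 12 at angle phi (radians).

    Assumption: phi enters through |phi|, making the density symmetric
    about the last-recorded direction of motion (phi = 0).
    """
    ratio = s / t
    if not 0.0 < ratio < 1.0:
        raise ValueError("steadiness parameters must satisfy 0 < s/t < 1")
    step = ratio * math.pi          # angular width of one step of the density
    a = abs(phi)
    if a <= math.pi / 2:
        # Forward half-plane: weight is raised above the uniform 1/(2*pi).
        return (1.0 + ratio * math.ceil((math.pi / 2 - a) / step)) / (2 * math.pi)
    # Backward half-plane: weight is lowered below the uniform 1/(2*pi).
    return (1.0 - ratio * math.ceil((a - math.pi / 2) / step)) / (2 * math.pi)


# Hypothetical parameters s = 1, t = 4: sample the density at a few headings.
samples = {phi: direction_pdf(phi, 1.0, 4.0) for phi in (0.0, 0.1, 1.0, math.pi)}
```

With s=1 and t=4, headings close to the previous direction share one density value, and the value drops in steps as the heading turns away from it.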

Assuming that a mobile user moves in a direction φ as illustrated in FIG. 5A with a speed of v, and assuming a convex safe region $\psi(u_i)$, a last-recorded location $\vec{P}_{last}$ and an updated location $\vec{P}$, the average location update cost $C_{u_i}$ over time can be computed as:

$C_{u_i} = C_l \cdot \left( \int_{-\pi}^{\pi} \frac{r(\phi)\,p(\phi)\,d\phi}{2\pi v} \right)^{-1} \qquad (\text{EQN. 13})$

where $C_l$ is the cost of a single location update, φ is the angle between the direction of motion of the mobile user and the mobile user's last-recorded direction of motion $P_{last}P$, and r(φ) is the length of the segment PR (where R is the point on the safe region boundary at which the next update is expected to occur). Given the angle φ, the elapsed time before the next update is r(φ)/v. The average elapsed time over all values of φ is given by

$\int_{-\pi}^{\pi} \frac{r(\phi)\,p(\phi)\,d\phi}{2\pi v}.$

Thus, the average location update cost $C_{u_i}$ can be re-written as:

$C_{u_i} = \frac{C_l \cdot 2\pi v}{\lambda(\phi)} \qquad (\text{EQN. 14})$

where $\lambda(\phi) = \int_{-\pi}^{\pi} r(\phi)\,p(\phi)\,d\phi$ is the weighted perimeter of the safe region. In order to minimize the update costs, one must maximize the value of the weighted perimeter. Therefore, the problem of minimizing update costs reduces to finding a rectangular safe region with a maximum weighted perimeter.
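The relationship between the weighted perimeter and the update cost can be checked numerically. The sketch below approximates λ(φ) for a rectangular safe region with a midpoint-rule sum and then applies EQN. 14. It assumes that φ is measured from the positive x axis (i.e., the last-recorded direction of motion points along +x), that the user lies strictly inside the rectangle, and that the density defaults to the uniform 1/(2π); all function names are hypothetical.

```python
import math
from typing import Callable, Tuple

Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def exit_distance(px: float, py: float, rect: Rect, phi: float) -> float:
    """Length r(phi) of the ray from interior point (px, py) to the boundary."""
    x_min, y_min, x_max, y_max = rect
    dx, dy = math.cos(phi), math.sin(phi)
    dist = math.inf
    if dx > 1e-12:
        dist = min(dist, (x_max - px) / dx)
    elif dx < -1e-12:
        dist = min(dist, (x_min - px) / dx)
    if dy > 1e-12:
        dist = min(dist, (y_max - py) / dy)
    elif dy < -1e-12:
        dist = min(dist, (y_min - py) / dy)
    return dist


def weighted_perimeter(px: float, py: float, rect: Rect,
                       pdf: Callable[[float], float] = lambda phi: 1.0 / (2 * math.pi),
                       n: int = 3600) -> float:
    """Midpoint-rule approximation of the integral of r(phi) * p(phi) over [-pi, pi]."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        phi = -math.pi + (k + 0.5) * h
        total += exit_distance(px, py, rect, phi) * pdf(phi) * h
    return total


def update_cost(c_l: float, v: float, lam: float) -> float:
    """Average location update cost per EQN. 14."""
    return c_l * 2 * math.pi * v / lam


lam_small = weighted_perimeter(0.0, 0.0, (-1.0, -1.0, 1.0, 1.0))
lam_large = weighted_perimeter(0.0, 0.0, (-2.0, -2.0, 2.0, 2.0))
cost_small = update_cost(1.0, 1.0, lam_small)
cost_large = update_cost(1.0, 1.0, lam_large)
```

Doubling the side length of the square doubles the weighted perimeter and therefore halves the average update cost, which illustrates why the method seeks the rectangle with the maximum weighted perimeter.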

In one embodiment, the maximum weighted perimeter safe region for a given mobile user is calculated by calculating the individual safe regions for each of the mobile user's location-centric triggers and then calculating the intersection of these safe regions. In another embodiment, illustrated in greater detail with respect to FIG. 6, the maximum weighted perimeter safe region is calculated for a set of relevant spatial trigger regions for a given mobile user.

FIG. 6 is a flow diagram illustrating one embodiment of a method 600 for computing a safe region, according to the present invention. Specifically, the method 600 computes a maximum weighted perimeter safe region, as described above. The method 600 may be implemented, for example, in accordance with step 210 of the method 200 and by the information monitoring server 102 illustrated in FIG. 1. As such, reference is made in the discussion of the method 600 to various components of the information monitoring server 102. However, it is understood that the method 600 is not limited to operation in conjunction with the information monitoring server 102, and may readily be deployed in systems having different configurations from that illustrated.

In one embodiment, the method 600 reduces computation costs by considering only relevant triggers in the vicinity of the mobile user's current location. In one embodiment, this is achieved by overlaying a grid over the entire universe of discourse U (or map).

The method 600 is initialized and proceeds to step 604, where the data processor 108 receives the mobile user's current location vector $\vec{P}$ and the current grid cell $G(\vec{P})$ in which the mobile user resides.

In step 606, the optimizer 112 identifies the set of triggers that intersect the current grid cell $G(\vec{P})$. As discussed above, embodiments of the method 600 consider only these triggers in the calculation of the safe region. In one embodiment, if none of the user's triggers intersect the current grid cell $G(\vec{P})$, then the optimizer 112 returns the entire current grid cell $G(\vec{P})$ as the safe region.
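The grid-based filtering of step 606 might be sketched as follows. The sketch assumes a uniform square grid anchored at the map origin and rectangular trigger regions; the names `grid_cell`, `intersects`, and `relevant_triggers`, and the cell size of 10 units, are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def grid_cell(p: Point, cell_size: float) -> Rect:
    """Return the grid cell containing location p (uniform grid, origin at 0,0)."""
    col = math.floor(p[0] / cell_size)
    row = math.floor(p[1] / cell_size)
    return (col * cell_size, row * cell_size,
            (col + 1) * cell_size, (row + 1) * cell_size)


def intersects(a: Rect, b: Rect) -> bool:
    """True if the interiors of rectangles a and b overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def relevant_triggers(cell: Rect, trigger_regions: List[Rect]) -> List[Rect]:
    """Keep only the trigger regions that intersect the current grid cell."""
    return [r for r in trigger_regions if intersects(cell, r)]


# Hypothetical data: three trigger regions, one of them far from the user.
cell = grid_cell((12.5, 3.0), 10.0)
triggers = [(11.0, 1.0, 13.0, 2.0), (25.0, 5.0, 30.0, 8.0), (19.0, 9.0, 22.0, 12.0)]
nearby = relevant_triggers(cell, triggers)

# With no intersecting triggers, the whole cell serves as the safe region.
fallback_safe_region = cell if not relevant_triggers(cell, []) else None
```

Only the triggers in `nearby` would be carried forward into the candidate-point construction; the distant trigger is never examined again for this grid cell.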

In step 608, the optimizer 112 partitions the current grid cell $G(\vec{P})$ into a plurality of partitions, with the mobile user's current location $\{P_x, P_y\}$ as the origin. In one embodiment, the optimizer 112 partitions the current grid cell $G(\vec{P})$ into four quadrants.

In step 610, the optimizer 112 defines, for each partition, a set of candidate points (cpSetPart). The set of candidate points comprises the set of points that can potentially form a corner of a rectangular safe region. In one embodiment, the set of candidate points is defined by first selecting the spatial region corner of each of the mobile user's triggers as a candidate point in its appropriate partition. For triggers that do not lie completely inside the current grid cell $G(\vec{P})$, the intersection points of the boundary of the current grid cell $G(\vec{P})$ and the trigger spatial region are selected as candidate points instead of the corner points (which fall outside the region of the current grid cell $G(\vec{P})$). In a further embodiment, the set of candidate points is expanded by also selecting, for trigger spatial conditions that intersect the x axis or y axis of the coordinate axes with origin at $\{P_x, P_y\}$, the points of intersection of the triggers with the axes. In a further embodiment still, if no points of intersection with an axis exist, the point of intersection of the axis with the current grid cell $G(\vec{P})$ is added to the set of candidate points.

In one embodiment, the set of candidate points is trimmed. In one embodiment, trimming includes removing, in the case of multiple candidate points in a partition that intersect the x or y axis, all candidate points on this axis except for the one that is closest to the origin. In a further embodiment, all points that dominate any other point in the candidate set are removed. A point P₁ dominates a point P₂ if P₁·x>P₂·x and P₁·y>P₂·y. In yet another embodiment, the candidate points are sorted according to increasing distance of the x coordinate from the origin. Points with the same x coordinate are arranged in order of decreasing distance of the y coordinate from the origin.
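The dominance-based trimming and the subsequent ordering can be illustrated for a single partition. The sketch below assumes Quadrant I, with coordinates already expressed relative to the mobile user's location as origin; the other quadrants would need their coordinates mirrored first. The axis-based trimming rule is not shown, and the name `trim_and_sort` is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def trim_and_sort(points: List[Point]) -> List[Point]:
    """Drop dominating candidate points, then order the survivors.

    A point p dominates q if p.x > q.x and p.y > q.y. Survivors are sorted
    by increasing x distance from the origin; ties are broken by decreasing
    y distance from the origin.
    """
    kept = [p for p in points
            if not any(p[0] > q[0] and p[1] > q[1] for q in points)]
    return sorted(kept, key=lambda p: (abs(p[0]), -abs(p[1])))


# Hypothetical Quadrant I candidates, relative to the user's location.
candidates = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0), (2.0, 5.0), (2.0, 3.0)]
trimmed = trim_and_sort(candidates)
```

Here (3, 3) is removed because it dominates (2, 2), and (2, 5) is removed because it dominates (1, 4); a rectangle with its far corner at either point would overlap the trigger region whose corner it dominates.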

FIG. 7A is a schematic diagram illustrating an exemplary grid cell 700 that is divided into four partitions or quadrants, labeled I-IV. As illustrated, a plurality of candidate points (labeled C₁₁-C₄₄) has been identified within the grid cell 700. Black dots represent the candidate points that have survived trimming, as discussed above, while hollow dots represent candidate points that have been trimmed.

Referring back to FIG. 6, in step 612, the optimizer 112 defines, for each partition, a set of tension points (tpSetPart). Tension points are obtained from candidate points by ensuring that only points that form the largest possible rectangular regions that do not overlap the spatial region of any trigger are selected. In one embodiment, each tension point $T_{Qi}$ (where Q∈{1,2,3,4} represents the partition that the tension point belongs to) is assigned the same x-coordinate as the corresponding candidate point $C_{Qi}$. $T_{Qi}$ is assigned the same y-coordinate as that of $C_{Q(i-1)}$, or $T_{Q(i-1)}$ if $T_{Qi}$ and $T_{Q(i-1)}$ have the same x coordinate. The y-coordinate of $T_{Qi}$ is set as either the top bound of the current grid cell $G(\vec{P})$ or the y-coordinate of a candidate point that intersects the y axis (if any such candidate points exist).

FIG. 7B is a schematic diagram illustrating the exemplary grid cell 700 of FIG. 7A, including a set of tension points (labeled T₁₁-T₄₄) obtained from the candidate points illustrated in FIG. 7A. If one imagines an elastic band laid around the set of candidate points, the set of tension points can be obtained by stretching the elastic band to obtain a rectilinear polygonal shape that does not overlap any of the triggers' spatial regions.

Referring back to FIG. 6, in step 614, the optimizer 112 defines a set of component rectangles. The set of tension points forms the opposite corner (i.e., opposite to the origin) of the set of candidate component rectangles in each partition. The safe region that is ultimately calculated is composed of the intersection of the component rectangles from each partition. In one embodiment, component rectangles with the maximum perimeter in each partition are selected, as this will facilitate the calculation of safe regions with the maximum weighted perimeter.

In step 616, the optimizer 112 calculates the safe region in accordance with the component rectangles. In one embodiment, this is accomplished using greedy heuristics that first select the partition in which the probability density function of the expected future movement of the mobile user is maximum. The component rectangle with the largest weighted perimeter in this partition is then selected. Partitions are further selected dependent on the distribution of probability density function values in the partition using the steady motion assumption. At each step, the component rectangle with the largest weighted perimeter is selected, and this continues until all partitions are processed using this greedy heuristic.

FIG. 7C is a schematic diagram illustrating the exemplary grid cell 700 of FIGS. 7A-B, including a set of component rectangles obtained from the tension points illustrated in FIG. 7B. As illustrated, the largest component rectangle in Quadrant I is first selected, since the probability density function values for mobile user movement are maximum in this quadrant. The next quadrant (i.e., Quadrant IV) is selected in a clockwise manner; probability density function values of motion direction are expected to be higher in Quadrant IV because the angle θ illustrated in FIG. 7C is such that θ<π/4. If the value of the angle θ were greater than π/4, Quadrant II would be selected next rather than Quadrant IV. Addition of the component rectangle at tension point T₄₄ provides a safe region with a larger perimeter than the safe region obtained by adding the component rectangle with the tension point T₄₁ or T₄₂. Finally, the component rectangles with tension points at T₂₃ in Quadrant II and T₃₄ in Quadrant III (selected in order of increasing expected probability density function values) are selected.

FIG. 7D is a schematic diagram illustrating the exemplary grid cell 700 of FIGS. 7A-C, in which a final safe region 702 composed of the component rectangles illustrated in FIG. 7C is selected.

The data manager 116 outputs the safe regions to the mobile users in step 618. The method 600 then returns to step 604, and proceeds as described above to process a new location vector and grid cell.

As discussed above, each location-centric trigger may also have a non-spatial predicate condition that requires monitoring of a non-spatial attribute $a_j^k$ (e.g., the price of gas at a location of interest). Each monitored attribute $a_j^k$ at a location of interest $l_j$ has at least one safe value container $\delta_k(l_j)$ associated with it. The following condition holds true for any safe value container: $P_r[\, m_{l_j}(t) \models_{\geq 1} T_j \mid \exists k \in [1 \ldots r],\ a_j^k \in \delta_k(l_j) \,] = 0$, which implies that if the value of even a single monitored attribute at a location of interest lies within the corresponding safe value container, the probability of any relevant triggers being activated is zero.

In one embodiment, the safe value containers are constructed to satisfy at least three goals: (1) quick calculation of the safe value containers (since calculation must be performed for each of the r monitored attributes at each location of interest); (2) quick containment check to verify that the current monitored attribute value lies within a one-dimensional value range; and (3) maximization of the value range covered by the safe value containers, so as to minimize the number of updates that need to be processed by the information monitoring server 102.

In one embodiment, the safe value containers for any monitored attribute comprise either a set of multiple safe value containers or a single safe value container. Multiple safe value containers reduce the number of updates for low update frequency data streams and data streams with high rates of change of data values, where the values of the monitored data may jump from one safe value container to another.

FIGS. 10A-B, for example, respectively illustrate the use of single and multiple safe value containers. In particular, FIG. 10A illustrates a single safe value container being invalidated at time instant t₂ when a monitored data source delivers data exhibiting a high rate of change and a low update frequency. By contrast, FIG. 10B illustrates that the use of multiple safe value containers eliminates the need to recompute the safe value containers at time instants t₂ and t₃.

FIG. 8 is a flow diagram illustrating one embodiment of a method 800 for computing a safe value container, according to the present invention. The method 800 may be implemented, for example, in accordance with step 210 of the method 200 and by the information monitoring server 102 illustrated in FIG. 1. As such, reference is made in the discussion of the method 800 to various components of the information monitoring server 102. However, it is understood that the method 800 is not limited to operation in conjunction with the information monitoring server 102, and may readily be deployed in systems having different configurations from that illustrated.

The method 800 is initialized at step 802 and proceeds to step 804, where the optimizer 112 receives the values of an attribute being monitored. For ease of explanation, the method 800 describes monitoring the values for a single attribute; however, it will be appreciated that the method 800 can just as easily be implemented to monitor values for multiple attributes.

In step 806, the optimizer 112 divides the entire value domain of the attribute into a plurality of equally sized blocks. This reduces the computation costs associated with calculating the safe value containers, since only triggers whose attribute value range intersects the current monitored block are considered for the purposes of safe value container calculation. Consider, for example, the monitored attribute values $a = \{a_1, a_2, \ldots, a_r\}$ and the corresponding blocks identified as $B(a_1), B(a_2), \ldots, B(a_r)$, respectively. Further consider the calculation of safe value containers for the k-th attribute $a_k$.

In step 808, the optimizer 112 selects the block $B(a_k)$ to monitor as the block in which the current value of attribute $a_k$ lies. The method 800 then proceeds to step 810, where the optimizer 112 determines the set $T_j$ of triggers such that the intersection of the predicate conditions of the triggers with the currently monitored block is not an empty set. In other words, the optimizer 112 determines the set $T_j$ of triggers such that $R_{i,j}^k \cap B(a_k) \neq \emptyset$.

In step 812, the optimizer 112 determines whether, for any of the triggers in the set $T_j$ of triggers, the monitored attribute value satisfies the predicate conditions. In other words, the optimizer 112 determines whether for any of the triggers $t_{i,j} \in T_j$, $m_{l_j}(t) \cdot a_k \in R_{i,j}^k$.

If the optimizer 112 concludes in step 812 that the monitored attribute value does not satisfy the predicate condition for any of the triggers in the set $T_j$, the safe value container is set in step 814 as the currently monitored block minus the union of the predicate conditions for all triggers that intersect the currently monitored block. In other words, the safe value container is set as $B(a_k) - (R_{1,j}^k \cup R_{2,j}^k \cup \ldots \cup R_{q,j}^k)$, where q denotes the number of triggers that intersect the currently monitored block $B(a_k)$. The method 800 then terminates in step 812.

Alternatively, if the optimizer 112 concludes in step 812 that the monitored attribute value satisfies the predicate condition for at least one of the triggers in the set $T_j$, the method 800 proceeds to step 816, where the optimizer 112 determines whether more than one trigger is satisfied by the monitored attribute value.

If the optimizer 112 concludes in step 816 that the monitored attribute value satisfies the predicate condition for more than one of the triggers in the set $T_j$, the method 800 proceeds to step 818, and the optimizer 112 sets the safe value container as the intersection of the predicate conditions for all triggers whose value range contains the value of the currently monitored attribute. In other words, the safe value container is set as $R_{1,j}^k \cap R_{2,j}^k \cap \ldots \cap R_{p,j}^k$, where p denotes the number of triggers whose value range contains the value $m_{l_j}(t) \cdot a_k$. The method 800 then terminates in step 812.

Alternatively, if the optimizer 112 concludes in step 816 that the monitored attribute value satisfies the predicate condition for only one of the triggers in the set $T_j$, the method 800 proceeds to step 820, and the optimizer 112 sets the safe value container to the value range of the predicate condition. In other words, the safe value container is set as $R_{i,j}^k$. The method 800 then terminates in step 812.
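The three outcomes of steps 812-820 might be sketched with one-dimensional intervals. The sketch assumes that each predicate value range and each block is a half-open interval [lo, hi), and it returns a list so that the "block minus union" case can yield several containers; the name `safe_value_containers` and the sample values are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # half-open interval [lo, hi); an assumption


def safe_value_containers(value: float, block: Interval,
                          ranges: List[Interval]) -> List[Interval]:
    """Compute safe value container(s) for one monitored attribute value."""
    lo_b, hi_b = block
    # Step 810: keep only trigger ranges that intersect the monitored block.
    relevant = [r for r in ranges if r[0] < hi_b and r[1] > lo_b]
    # Step 812: which of those ranges does the current value satisfy?
    satisfied = [r for r in relevant if r[0] <= value < r[1]]
    if satisfied:
        # Steps 818/820: intersection of all satisfied ranges (for a single
        # satisfied trigger this is simply that trigger's range).
        return [(max(r[0] for r in satisfied), min(r[1] for r in satisfied))]
    # Step 814: the block minus the union of the relevant ranges.
    gaps: List[Interval] = []
    cursor = lo_b
    for lo, hi in sorted(relevant):
        if lo > cursor:
            gaps.append((cursor, lo))
        cursor = max(cursor, hi)
    if cursor < hi_b:
        gaps.append((cursor, hi_b))
    return gaps


# Hypothetical block [0, 10) with two overlapping trigger ranges inside it.
block = (0.0, 10.0)
ranges = [(2.0, 4.0), (3.0, 6.0), (20.0, 30.0)]
none_satisfied = safe_value_containers(7.0, block, ranges)
two_satisfied = safe_value_containers(3.5, block, ranges)
one_satisfied = safe_value_containers(5.0, block, ranges)
```

When no trigger is satisfied, the result contains two disjoint containers, which corresponds to the multiple-container embodiment: a value that jumps from one container to the other does not force a recomputation.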

FIG. 9 is a high-level block diagram of the location-based monitoring method that is implemented using a general purpose computing device 900. In one embodiment, a general purpose computing device 900 comprises a processor 902, a memory 904, a location-based monitoring module 905 and various input/output (I/O) devices 906 such as a display, a keyboard, a mouse, a stylus, a wireless network access card, an Ethernet interface, and the like. In one embodiment, at least one I/O device is a storage device (e.g., a disk drive, an optical disk drive, a floppy disk drive). It should be understood that the location-based monitoring module 905 can be implemented as a physical device or subsystem that is coupled to a processor through a communication channel.

Alternatively, the location-based monitoring module 905 can be represented by one or more software applications (or even a combination of software and hardware, e.g., using Application Specific Integrated Circuits (ASIC)), where the software is loaded from a storage medium (e.g., I/O devices 906) and operated by the processor 902 in the memory 904 of the general purpose computing device 900. Thus, in one embodiment, the location-based monitoring module 905 for providing fault tolerance for stream processing applications, as described herein with reference to the preceding figures, can be stored on a computer readable storage medium (e.g., RAM, magnetic or optical drive or diskette, and the like).

It should be noted that although not explicitly specified, one or more steps of the methods described herein may include a storing, displaying and/or outputting step as required for a particular application. In other words, any data, records, fields, and/or intermediate results discussed in the methods can be stored, displayed, and/or outputted to another device as required for a particular application. Furthermore, steps or blocks in the accompanying figures that recite a determining operation or involve a decision, do not necessarily require that both branches of the determining operation be practiced. In other words, one of the branches of the determining operation can be deemed as an optional step.

While the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof. Various embodiments presented herein, or portions thereof, may be combined to create further embodiments. Furthermore, terms such as top, side, bottom, front, back, and the like are relative or positional terms and are used with respect to the exemplary embodiments illustrated in the figures, and as such these terms may be interchangeable.

What is claimed is:
 1. A method for assisting in processing a plurality of incoming data streams, the plurality of incoming data streams comprising a first data stream that specifies one or more locations of a mobile user at one or more times and a second data stream that specifies one or more values of a monitored attribute at a location of interest at one or more times, the method comprising: delivering a location-centric trigger to a location-based information monitoring system, the location-centric trigger specifying at least one spatial predicate condition relevant to the location of interest and at least one non-spatial predicate condition relevant to the location of interest; receiving a safe region from the location-based information monitoring system, the safe region comprising at least one of the one or more locations whose probability of satisfying the spatial predicate condition falls below a first predefined threshold; processing a current location of the mobile user against the safe region; and delivering the current location to the location-based information monitoring system only if the current location lies outside of the safe region, wherein at least one of: the delivering the location-centric trigger, the receiving, the processing, or the delivering the current location is performed using a processor.
 2. The method of claim 1, wherein the safe region has a maximum weighted perimeter.
 3. The method of claim 2, wherein the maximum weighted perimeter comprises an intersection of a plurality of safe regions associated with the mobile user.
 4. The method of claim 1, wherein the method is performed by a mobile device operated by the mobile user.
 5. The method of claim 1, further comprising: receiving an update from the location-based information monitoring system after the delivering, wherein the update indicates that the location-centric trigger has been satisfied.
 6. The method of claim 1, further comprising: storing the safe region on a mobile device operated by the mobile user.
 7. The method of claim 6, further comprising: periodically re-processing the current location against the safe region.
 8. An apparatus comprising computer readable storage medium containing an executable program for assisting in processing a plurality of incoming data streams, the plurality of incoming data streams comprising a first data stream that specifies one or more locations of a mobile user at one or more times and a second data stream that specifies one or more values of a monitored attribute at a location of interest at one or more times, where the program performs steps comprising: delivering a location-centric trigger to a location-based information monitoring system, the location-centric trigger specifying at least one spatial predicate condition relevant to the location of interest and at least one non-spatial predicate condition relevant to the location of interest; receiving a safe region from the location-based information monitoring system, the safe region comprising at least one of the one or more locations whose probability of satisfying the spatial predicate condition falls below a first predefined threshold; processing a current location of the mobile user against the safe region; and delivering the current location to the location-based information monitoring system only if the current location lies outside of the safe region.
 9. The apparatus of claim 8, wherein the executable program is executed by a processor that is part of a mobile device operated by the mobile user.
 10. The apparatus of claim 8, wherein the steps further comprise: receiving an update from the location-based information monitoring system after the delivering, wherein the update indicates that the location-centric trigger has been satisfied.
 11. The apparatus of claim 8, wherein the steps further comprise: storing the safe region in a location accessible to the processor.
 12. The apparatus of claim 11, wherein the steps further comprise: periodically re-processing the current location against the safe region.
 13. The apparatus of claim 8, wherein the safe region has a maximum weighted perimeter.
 14. The apparatus of claim 13, wherein the maximum weighted perimeter comprises an intersection of a plurality of safe regions associated with the mobile user.